An exact nonparametric method for inferring mosaic structure in sequence triplets.

Authors

  • Maciej F Boni
  • David Posada
  • Marcus W Feldman
Abstract

Statistical tests for detecting mosaic structure or recombination among nucleotide sequences usually rely on identifying a pattern or a signal that would be unlikely to appear under clonal reproduction. Dozens of such tests have been described, but many are hampered by long running times, confounding of selection and recombination, and/or inability to isolate the mosaic-producing event. We introduce a test that is exact, nonparametric, rapidly computable, free of the infinite-sites assumption, able to distinguish between recombination and variation in mutation/fixation rates, and able to identify the breakpoints and sequences involved in the mosaic-producing event. Our test considers three sequences at a time: two parent sequences that may have recombined, with one or two breakpoints, to form the third sequence (the child sequence). Excess similarity of the child sequence to a candidate recombinant of the parents is a sign of recombination; we take the maximum value of this excess similarity as our test statistic Delta(m,n,b). We present a method for rapidly calculating the distribution of Delta(m,n,b) and demonstrate that it has comparable power to and a much improved running time over previous methods, especially in detecting recombination in large data sets.
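
The idea of scoring a child sequence against two candidate parents can be illustrated with a small sketch. It rests on one assumed reading of "excess similarity": sites where the parents differ are coded as +1 when the child matches the first parent and -1 when it matches the second, and the statistic is the largest drop (maximum descent) of the resulting walk. The function names `informative_steps`, `max_descent`, and `exact_p_value` are hypothetical, and the brute-force enumeration stands in for, and is far slower than, the rapid calculation the abstract refers to.

```python
from itertools import combinations
from math import comb


def informative_steps(parent_p, parent_q, child):
    """Code each site where the parents differ: +1 if the child matches P, -1 if it matches Q.

    Sites where the parents agree, or where the child matches neither, are skipped.
    """
    steps = []
    for p, q, c in zip(parent_p, parent_q, child):
        if p == q:
            continue
        if c == p:
            steps.append(1)
        elif c == q:
            steps.append(-1)
    return steps


def max_descent(steps):
    """Largest drop of the cumulative walk below its running peak (the walk starts at 0)."""
    height = peak = best = 0
    for s in steps:
        height += s
        peak = max(peak, height)
        best = max(best, peak - height)
    return best


def exact_p_value(m, n, k):
    """P(max descent >= k) when all orderings of m up-steps and n down-steps are equally likely.

    Assumption: brute-force enumeration, feasible only for small m + n.
    """
    hits = 0
    for ups in combinations(range(m + n), m):
        up_set = set(ups)
        walk = [1 if i in up_set else -1 for i in range(m + n)]
        if max_descent(walk) >= k:
            hits += 1
    return hits / comb(m + n, m)


# Toy triplet: the child copies P for the first half and Q for the second half.
steps = informative_steps("AAAAAAAA", "CCCCCCCC", "AAAACCCC")
m, n = steps.count(1), steps.count(-1)
k = max_descent(steps)            # 4
p_value = exact_p_value(m, n, k)  # 5/70, about 0.071
```

In this toy case only 5 of the 70 possible orderings place all four down-steps consecutively, so the observed pattern has probability 5/70 under random ordering. The sketch looks for a single switch from P to Q; swapping the parent arguments checks the opposite direction.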

Similar resources

Inferring phylogenetic relationships avoiding forbidden rooted triplets

To construct a phylogenetic tree or phylogenetic network for describing the evolutionary history of a set of species is a well-studied problem in computational biology. One previously proposed method to infer a phylogenetic tree/network for a large set of species is by merging a collection of known smaller phylogenetic trees on overlapping sets of species so that no (or as little as possible) b...

An Investigation on characterization of cucumber mosaic virus isolated from lily green house in Damavand County, Iran

Background and Aims: Virus infections represent some of the most important diseases of lily plants because of the devastating effects on the crops and the absence of effective treatments. A survey for virus diseases of lilies revealed the occurrence of Cucumber mosaic virus (CMV) in plants growing in Tehran province, Iran. Materials and Methods: During 2013, 50 lily samples with virus-...

Towards an accurate identification of mosaic genes and partial horizontal gene transfers

Many bacteria and viruses adapt to varying environmental conditions through the acquisition of mosaic genes. A mosaic gene is composed of alternating sequence polymorphisms either belonging to the host original allele or derived from the integrated donor DNA. Often, the integrated sequence contains a selectable genetic marker (e.g. marker allowing for antibiotic resistance). An effective identi...

Methodology for Inferring Moral Priorities According to the Narrations of "Afal Tafzil"

Considering the different levels of moral values in Islam, in order to know the most important values and also to eliminate the contradiction, it is necessary to deduce from the texts of verses and hadiths. One of the most important aspects in these texts is the "structure of Tafzil". Some narrations of this structure indicate the priority of one or more values and others indicate a rule in det...

Bayesian approach to inference of population structure

Methods for inferring population structure, and their applications in identifying disease models and in forecasting the physical and mental condition of human beings, have become increasingly important. In this article, first, the motivation and significance of studying the problem of population structure are explained. In the next section, the applications of inference of p...

Journal:
  • Genetics

Volume 176, Issue 2

Pages: -

Publication date: 2007